Abstract. In this paper, we analyze the process of "assembling" new matrix geometric means
from existing ones, through function composition or limit processes. We show that for n = 4 a new
matrix mean exists which is simpler to compute than the existing ones. Moreover, we show that
for n > 4 the existing proving strategies cannot provide a mean computationally simpler than the
existing ones.
1. Introduction.

Literature review. In the last few years, several papers have been devoted to
defining a proper way to generalize the concept of geometric mean to $n \geq 3$ Hermitian,
positive definite $m \times m$ matrices. A seminal paper by Ando, Li and Mathias [1] defined
the mathematical problem by stating ten properties that a "good" matrix geometric
mean should satisfy. However, these properties do not uniquely define a multivariate
matrix geometric mean; thus several different definitions appeared in the literature.

Ando, Li and Mathias [1] first proposed a mean whose definition for $n$ matrices
is based on a limit process involving several geometric means of $n - 1$ matrices. Later
Bini, Meini and Poloni [4] noted that the slow convergence speed of this method
prevents its use in applications; its main shortcoming is the fact that its complexity grows
as $O(n!)$ with the number of involved matrices. In the same paper, they proposed
a similar limit process with increased convergence speed, but still with complexity
$O(n!)$. Pálfia [9] proposed a mean based on a similar process involving only means
of 2 matrices, and thus much simpler and cheaper to compute, but lacking property
P3 (permutation invariance) from the ALM list. Lim [6] proposed a family of matrix
geometric means that are based on an iteration requiring at each step the computation
of a mean of $m \geq n$ matrices. Since the computational complexity for all known
means greatly increases with $n$, the resulting family is useful as an example but highly
impractical for numerical computations.

*Received by the editors on Month x, 200x. Accepted for publication on Month y, 200y. Handling
Editor: .

†Scuola Normale Superiore, Piazza dei Cavalieri, 7, 56126 Pisa, Italy (f.poloni@sns.it).

F. Poloni

At the same time, Moakher [7, 8] and Bhatia and Holbrook [3, 2] proposed
a completely different definition, which we shall call the Riemannian centroid of
$A_1, A_2, \dots, A_n$. The Riemannian centroid $G^R(A_1, A_2, \dots, A_n)$ is defined as the
minimizer of a sum of squared distances,

$$G^R(A_1, A_2, \dots, A_n) = \arg\min_X \sum_{i=1}^{n} \delta^2(A_i, X), \tag{1.1}$$

where $\delta$ is the geodesic distance induced by a natural Riemannian metric on the space
of symmetric positive definite matrices. The same $X$ is the unique solution of the
equation

$$\sum_{i=1}^{n} \log(A_i^{-1} X) = 0, \tag{1.2}$$

involving the matrix logarithm function. While most of the ALM properties are easy
to prove, it is still an open problem whether it satisfies P4 (monotonicity). The
computational experiments performed up to now gave no counterexamples, but the 
monotonicity of the Riemannian centroid is still a conjecture [HI, up to our knowledge. 

Moreover, while the other means had constructive definitions, it is not apparent
how to compute the solution to either (1.1) or (1.2). Two methods have been proposed,
one based on a fixed-point iteration [8] and one on the Newton methods for manifolds
[9, 8]. Although both seem to work well on "tame" examples, their computational
results show a fast degradation of the convergence behavior as the number of matrices
and their dimension increase. It is unclear whether on more complicated examples
there is convergence in the first place; unlike the other means, the convergence of
these iteration processes has not been proved, as far as we know.

Notations. Let us denote by $P_m$ the space of Hermitian positive-definite $m \times m$
matrices. For all $A, B \in P_m$, we shall say that $A < B$ ($A \leq B$) if $B - A$ is positive
definite (semidefinite). With $A^*$ we denote the conjugate transpose of $A$. We shall say
that $\mathcal{A} = (A_i)_{i=1}^{n} \in (P_m)^n$ is a scalar $n$-tuple of matrices if $A_1 = A_2 = \dots = A_n$. We
shall use the convention that both $Q(\mathcal{A})$ and $Q(A_1, \dots, A_n)$ denote the application of
the map $Q : (P_m)^n \to P_m$ to the $n$-tuple $\mathcal{A}$.

ALM properties. Ando, Li and Mathias [1] introduced ten properties defining
when a map $G : (P_m)^n \to P_m$ can be called a geometric mean. Following their
paper, we report here the properties for $n = 3$ only, for the sake of simplicity; the
generalization to different values of $n$ is straightforward.

P1 (consistency with scalars) If $A, B, C$ commute then $G(A, B, C) = (ABC)^{1/3}$.
P1' This implies $G(A, A, A) = A$.

P2 (joint homogeneity) $G(\alpha A, \beta B, \gamma C) = (\alpha\beta\gamma)^{1/3} G(A, B, C)$, for each $\alpha, \beta, \gamma > 0$.
P2' This implies $G(\alpha A, \alpha B, \alpha C) = \alpha G(A, B, C)$.

P3 (permutation invariance) $G(A, B, C) = G(\pi(A, B, C))$ for all the permutations
$\pi(A, B, C)$ of $A, B, C$.

P4 (monotonicity) $G(A, B, C) \geq G(A', B', C')$ whenever $A \geq A'$, $B \geq B'$, $C \geq C'$.

P5 (continuity from above) If $A_n, B_n, C_n$ are monotonic decreasing sequences
converging to $A, B, C$, respectively, then $G(A_n, B_n, C_n)$ converges to $G(A, B, C)$.

P6 (congruence invariance) $G(S^* A S, S^* B S, S^* C S) = S^* G(A, B, C) S$ for any
nonsingular $S$.

P7 (joint concavity) If $A = \lambda A_1 + (1 - \lambda) A_2$, $B = \lambda B_1 + (1 - \lambda) B_2$, $C = \lambda C_1 + (1 - \lambda) C_2$,
then $G(A, B, C) \geq \lambda G(A_1, B_1, C_1) + (1 - \lambda) G(A_2, B_2, C_2)$.

P8 (self-duality) $G(A, B, C)^{-1} = G(A^{-1}, B^{-1}, C^{-1})$.

P9 (determinant identity) $\det G(A, B, C) = (\det A \det B \det C)^{1/3}$.

P10 (arithmetic-geometric-harmonic mean inequality)
$$\frac{A + B + C}{3} \geq G(A, B, C) \geq \left( \frac{A^{-1} + B^{-1} + C^{-1}}{3} \right)^{-1}.$$

The matrix geometric mean for n = 2. For $n = 2$, the ALM properties uniquely
define a matrix geometric mean which can be expressed explicitly as

$$A \# B := A (A^{-1} B)^{1/2}. \tag{1.3}$$

This is a particular case of the more general map

$$A \#_t B := A (A^{-1} B)^{t}, \quad t \in \mathbb{R}, \tag{1.4}$$

which has a geometrical interpretation as the parametrization of the geodesic joining
$A$ and $B$ for a certain Riemannian geometry on $P_m$ [2].
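Formulas (1.3) and (1.4) can be tried numerically with the following minimal sketch. It assumes real symmetric positive definite inputs and NumPy; the helper names `spd_power` and `geodesic` are illustrative and not taken from the paper. Instead of the literal expression $A(A^{-1}B)^t$, the code evaluates the algebraically equivalent symmetric form $A^{1/2}(A^{-1/2} B A^{-1/2})^t A^{1/2}$, so that only symmetric eigenvalue problems are solved.

```python
import numpy as np

def spd_power(A, t):
    """A**t for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

def geodesic(A, B, t=0.5):
    """A #_t B = A (A^{-1} B)^t, computed as A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    G = Ah @ spd_power((M + M.T) / 2, t) @ Ah
    return (G + G.T) / 2  # remove rounding-level asymmetry

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 3.0]])
G = geodesic(A, B)  # the geometric mean A # B (t = 1/2)

# Sanity checks: symmetry of #, the identity G A^{-1} G = B, and the endpoints.
print(np.allclose(G, geodesic(B, A)))
print(np.allclose(G @ np.linalg.inv(A) @ G, B))
print(np.allclose(geodesic(A, B, 0.0), A), np.allclose(geodesic(A, B, 1.0), B))
```

The endpoint checks reflect the geodesic interpretation: $t = 0$ returns $A$, $t = 1$ returns $B$, and $t = 1/2$ is the midpoint $A \# B$.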



The ALM and BMP means. Ando, Li and Mathias [1] recursively define a matrix
geometric mean $G^{ALM}_n$ of $n$ matrices in this way. The mean $G^{ALM}_2$ of two matrices
coincides with (1.3); for $n \geq 3$, suppose the mean of $n - 1$ matrices $G^{ALM}_{n-1}$ is already
defined. Given $A_1, \dots, A_n$, compute for each $j = 1, 2, \dots$

$$A_i^{(j+1)} := G^{ALM}_{n-1}\bigl(A_1^{(j)}, \dots, A_{i-1}^{(j)}, A_{i+1}^{(j)}, \dots, A_n^{(j)}\bigr), \quad i = 1, \dots, n, \tag{1.5}$$

where $A_i^{(1)} := A_i$, $i = 1, \dots, n$. The sequences $(A_i^{(j)})_{j=1}^{\infty}$ converge to a common (not
depending on $i$) matrix, and this matrix is a geometric mean of $A_1^{(1)}, \dots, A_n^{(1)}$.

The mean proposed by Bini, Meini and Poloni [4] is defined in the same way, but
with (1.5) replaced by

$$A_i^{(j+1)} := G^{BMP}_{n-1}\bigl(A_1^{(j)}, \dots, A_{i-1}^{(j)}, A_{i+1}^{(j)}, \dots, A_n^{(j)}\bigr) \#_{1/n} A_i^{(j)}, \quad i = 1, \dots, n. \tag{1.6}$$

Though both maps satisfy the ALM properties, matrices $A, B, C$ exist for which
$G^{ALM}(A, B, C) \neq G^{BMP}(A, B, C)$.

While the former iteration converges linearly, the latter converges cubically, and
thus allows one to compute a matrix geometric mean with a lower number of iterations.
In fact, if we call $p_k$ the average number of iterations that the process giving a mean of
$k$ matrices takes to converge (which may vary significantly depending on the starting
matrices), the total computational cost of the ALM and BMP means can be expressed
as $O(n!\, p_3 p_4 \cdots p_n m^3)$. The only difference between the two complexity bounds lies in
the expected magnitude of the values $p_k$. The presence of a factorial and of a linear
number of factors $p_k$ is undesirable, since it means that the problem scales very badly
with $n$. In fact, already with $n = 7, 8$ and moderate values of $m$, a large CPU time is
generally needed to compute a matrix geometric mean [4].
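The difference between the two iterations can be observed on a small case. The sketch below, for $n = 3$ and real symmetric positive definite inputs, runs (1.5) and (1.6) with the two-matrix mean (1.3) as the inner mean and counts the steps needed to reach a common limit. The function name `mean3`, the tolerance, the stopping rule and the test matrices are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def spd_power(A, t):
    """A**t for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

def geodesic(A, B, t=0.5):
    """A #_t B, computed as A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    G = Ah @ spd_power((M + M.T) / 2, t) @ Ah
    return (G + G.T) / 2

def mean3(A, B, C, kind="ALM", tol=1e-10, max_iter=200):
    """Iterate (1.5) (kind="ALM") or (1.6) (kind="BMP") for n = 3.

    Returns the first component of the final triple and the iteration count.
    """
    X = [A, B, C]
    for it in range(1, max_iter + 1):
        # Mean of the two matrices other than X[i], for each i.
        G = [geodesic(X[(i + 1) % 3], X[(i + 2) % 3]) for i in range(3)]
        if kind == "BMP":
            # Move from that mean one third of the way back towards X[i].
            G = [geodesic(G[i], X[i], 1.0 / 3.0) for i in range(3)]
        X = G
        spread = max(np.abs(X[i] - X[j]).max() for i in range(3) for j in range(i))
        if spread < tol:
            return X[0], it
    raise RuntimeError("no convergence within max_iter")

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 3.0]])
C = np.array([[5.0, -2.0], [-2.0, 1.0]])

G_alm, it_alm = mean3(A, B, C, "ALM")
G_bmp, it_bmp = mean3(A, B, C, "BMP")
print("ALM iterations:", it_alm, " BMP iterations:", it_bmp)
print("entrywise gap between the two limits:", np.abs(G_alm - G_bmp).max())
```

Both limits satisfy the determinant identity P9, since each step preserves the product of the three determinants; the two limits themselves need not coincide.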

The Pálfia mean. Pálfia [9] proposed to consider the following iteration. Let
again $A_i^{(1)} := A_i$, $i = 1, \dots, n$. Let us define

$$A_i^{(j+1)} := A_i^{(j)} \# A_{i+1}^{(j)}, \quad i = 1, \dots, n, \tag{1.7}$$

where the indices are taken modulo $n$, i.e., $A_{n+1}^{(j)} = A_1^{(j)}$ for all $j$. We point out
that the definition in the original paper [9] is slightly different, as it considers several
possible orderings of the input matrices, but the means defined there can be put in
the form (1.7) up to a permutation of the starting matrices $A_1, \dots, A_n$.

As for the previous means, it can be proved that the iteration (1.7) converges to
a scalar $n$-tuple; we call the common limit of all components $G^P(A_1, \dots, A_n)$. As we
noted above, this function does not satisfy P3 (permutation invariance), and thus it
is not a geometric mean in the ALM sense.
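A minimal sketch of the cyclic iteration (1.7) for four real symmetric positive definite matrices follows. The function name `palfia_mean`, the stopping rule and the test matrices (all of determinant 1) are illustrative assumptions. The script checks invariance under a cyclic shift of the inputs and prints, without asserting anything about it, the gap obtained after a reordering that is not a rotation or a reflection of the cycle.

```python
import numpy as np

def spd_power(A, t):
    """A**t for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

def geodesic(A, B, t=0.5):
    """A #_t B, computed as A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    G = Ah @ spd_power((M + M.T) / 2, t) @ Ah
    return (G + G.T) / 2

def palfia_mean(mats, tol=1e-10, max_iter=1000):
    """Iterate (1.7): replace each matrix by its mean with the next one, cyclically."""
    X = list(mats)
    n = len(X)
    for _ in range(max_iter):
        X = [geodesic(X[i], X[(i + 1) % n]) for i in range(n)]
        if max(np.abs(X[i] - X[0]).max() for i in range(1, n)) < tol:
            return X[0]
    raise RuntimeError("no convergence within max_iter")

A = np.eye(2)
B = np.diag([4.0, 0.25])
C = np.array([[2.0, 1.0], [1.0, 1.0]])
D = np.array([[1.0, 1.0], [1.0, 2.0]])

G = palfia_mean([A, B, C, D])
G_rot = palfia_mean([B, C, D, A])    # cyclic shift of the inputs
G_swap = palfia_mean([A, C, B, D])   # not a rotation or reflection of the cycle

print("gap after cyclic shift:", np.abs(G - G_rot).max())
print("gap after swapping B and C:", np.abs(G - G_swap).max())
```

The cyclic shift only relabels the components of every iterate, so the first gap is at rounding level; the second gap is the quantity that P3 would require to vanish.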

Other composite means. Apart from the Riemannian centroid, all the other
definitions follow the same pattern:

• build new functions of $n$ matrices by taking nested compositions of the existing
means, preferably using only means of fewer than $n$ matrices;

• take the common limit of a set of $n$ functions defined as in the above step.

The possibilities for defining new iterations following this pattern are endless. Ando,
Li, Mathias, and Bini, Meini, Poloni chose to use in the first step composite functions
using computationally expensive means of $n - 1$ matrices; this led to poor convergence
results. Pálfia chose instead to use more economical means of two variables as starting
points; this led to better convergence (no $O(n!)$), but to a function which is not
symmetric with respect to permutations of its entries (P3, permutation invariance).
As we shall see in the following, the property P3 is crucial: all the other ones are
easily proved for a mean defined as composition/limit of existing means.

A natural question to ask is whether we can build a matrix geometric mean of
$n$ matrices as the composition of matrix means of fewer matrices, without the need of
a limit process. Two such unsuccessful attempts are reported in the paper by Ando,
Li and Mathias [1], as examples of the fact that it is not easy to define a matrix
mean satisfying P1-P10. The first is

$$G^{4rec}(A, B, C, D) := (A \# B) \# (C \# D). \tag{1.8}$$

Unfortunately, there are matrices such that $(A \# B) \# (C \# D) \neq (A \# C) \# (B \# D)$,
so P3 fails. A second attempt is

$$G^{rec}(A, B, C) := (A^{4/3} \# B^{4/3}) \# C^{2/3}, \tag{1.9}$$

where the exponents are chosen so that P1 (consistency with scalars) is satisfied.
Again, this function is not symmetric in its arguments, and thus fails to satisfy P3.

A second natural question is whether an iterative scheme such as the ones for
$G^{ALM}$, $G^{BMP}$ and $G^P$ may yield P3 without having an $O(n!)$ computational cost. For
example, if we could build a scheme similar to the ALM and BMP ones, but using
only means of $\frac{n}{2}$ matrices in the recursion, then the $O(n!)$ growth would disappear.

In this paper, we aim to analyze in more detail the process of "assembling" new
matrix means from the existing ones, and show which new means can be found,
and what cannot be done because of group-theoretical obstructions related to the
symmetry properties of the composed functions. By means of a group-theoretical
analysis, we will show that for $n = 4$ a new matrix mean exists which is simpler to
compute than the existing ones; numerical experiments show that the new definition
leads to a significant computational advantage. Moreover, we will show that for $n > 4$
the existing strategies of composing matrix means and taking limits cannot provide
a mean which is computationally simpler than the existing ones.

2. Quasi-means and notation.

Quasi-means. Let us introduce the following variants to some of the Ando-Li-Mathias
properties.

P1'' Weak consistency with scalars. There are $\alpha, \beta, \gamma \in \mathbb{R}$ such that if $A, B, C$
commute, then $G(A, B, C) = A^{\alpha} B^{\beta} C^{\gamma}$.

P2'' Weak homogeneity. There are $\alpha, \beta, \gamma \in \mathbb{R}$ such that for each $r, s, t > 0$,
$G(rA, sB, tC) = r^{\alpha} s^{\beta} t^{\gamma} G(A, B, C)$. Notice that if P1'' holds as well, these
must be the same $\alpha, \beta, \gamma$ (proof: substitute scalar values in P1'').

P9' Weak determinant identity. For all $d > 0$, if $\det A = \det B = \det C = d$, then
$\det G(A, B, C) = d$.

We shall call a quasi-mean a function $Q : (P_m)^n \to P_m$ that satisfies P1'', P2'',
P4, P6, P7, P8, P9'. This models expressions which are built starting from basic
matrix means but are not symmetric, e.g., $A \# G(B, C, D \# E)$, (1.8), and (1.9).
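The two compositions (1.8) and (1.9) are easy to evaluate, which makes their lack of symmetry directly observable. The sketch below is illustrative only: the function names `g4rec` and `grec`, the NumPy helpers and the test matrices are assumptions, and real symmetric positive definite inputs are used. It checks that (1.9) reduces to $(ABC)^{1/3}$ on commuting (diagonal) matrices and measures how far (1.8) is from being invariant when $B$ and $C$ are exchanged.

```python
import numpy as np

def spd_power(A, t):
    """A**t for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

def geodesic(A, B, t=0.5):
    """A #_t B, computed as A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    G = Ah @ spd_power((M + M.T) / 2, t) @ Ah
    return (G + G.T) / 2

def g4rec(A, B, C, D):
    """Composition (1.8): (A # B) # (C # D)."""
    return geodesic(geodesic(A, B), geodesic(C, D))

def grec(A, B, C):
    """Composition (1.9): (A^{4/3} # B^{4/3}) # C^{2/3}."""
    return geodesic(geodesic(spd_power(A, 4 / 3), spd_power(B, 4 / 3)),
                    spd_power(C, 2 / 3))

A = np.eye(2)
B = np.diag([4.0, 0.25])
C = np.array([[2.0, 1.0], [1.0, 1.0]])
D = np.array([[1.0, 1.0], [1.0, 2.0]])

# Symmetries that follow from A # B = B # A alone hold to rounding error.
print(np.allclose(g4rec(A, B, C, D), g4rec(B, A, D, C)))
# Exchanging B and C changes the value: P3 fails for (1.8).
asym = np.linalg.norm(g4rec(A, B, C, D) - g4rec(A, C, B, D))
print("asymmetry of (1.8):", asym)
# On commuting matrices, (1.9) agrees with (ABC)^{1/3}.
D1, D2, D3 = np.diag([1.0, 8.0]), np.diag([8.0, 1.0]), np.diag([27.0, 64.0])
print(np.allclose(grec(D1, D2, D3), np.diag([6.0, 8.0])))
```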

Theorem 2.1. If a quasi-mean $Q$ satisfies P3 (permutation invariance), then it
is a geometric mean.

Proof. From P2'' and P3, it follows that $\alpha = \beta = \gamma$. From P9', it follows that if
$\det A = \det B = \det C = 1$,

$$2^m = \det Q(2A, 2B, 2C) = \det\bigl(2^{\alpha + \beta + \gamma} Q(A, B, C)\bigr) = 2^{m(\alpha + \beta + \gamma)},$$

thus $\alpha + \beta + \gamma = 1$. The two relations combined together yield $\alpha = \beta = \gamma = 1/3$.
Finally, it is proved in Ando, Li and Mathias [1] that P5 and P10 are implied by the
other eight properties P1-P4 and P6-P9. □

For two quasi-means $Q$ and $R$ of $n$ matrices, we shall write $Q = R$ if $Q(\mathcal{A}) = R(\mathcal{A})$
for each $n$-tuple $\mathcal{A} \in (P_m)^n$.

Group theory notation. The notation $H \leq G$ ($H < G$) means that $H$ is a
subgroup (proper subgroup) of $G$. Let us denote by $\mathfrak{S}_n$ the symmetric group on $n$
elements, i.e., the group of all permutations of the set $\{1, 2, \dots, n\}$. As usual, the
symbol $(a_1 a_2 a_3 \dots a_k)$ stands for the permutation ("cycle") that maps $a_1 \mapsto a_2$,
$a_2 \mapsto a_3$, ..., $a_{k-1} \mapsto a_k$, $a_k \mapsto a_1$ and leaves the other elements of $\{1, 2, \dots, n\}$
unchanged. Different symbols in the above form can be chained to denote the group
operation of function composition; for instance, $\sigma = (13)(24)$ is the permutation
$(1, 2, 3, 4) \mapsto (3, 4, 1, 2)$. We shall denote by $\mathfrak{A}_n$ the alternating group on $n$ elements,
i.e., the only subgroup of index 2 of $\mathfrak{S}_n$, and by $\mathfrak{D}_n$ the dihedral group over $n$
elements, with cardinality $2n$. The latter is identified with the subgroup of $\mathfrak{S}_n$ generated
by the rotation $(1, 2, \dots, n)$ and the mirror symmetry $(2, n)(3, n - 1) \cdots (n/2, n/2 + 2)$
(for even values of $n$) or $(2, n)(3, n - 1) \cdots ((n + 1)/2, (n + 3)/2)$ (for odd values of $n$).

Coset transversals. Let now $H \leq \mathfrak{S}_n$, and let $\{\sigma_1, \dots, \sigma_r\} \subseteq \mathfrak{S}_n$ be a transversal
for the right cosets $H\sigma$, i.e., a set of maximal cardinality $r = n!/|H|$ such that
$\sigma_i \sigma_j^{-1} \notin H$ for all $i \neq j$. The group $\mathfrak{S}_n$ acts by permutation over the cosets
$(H\sigma_1, \dots, H\sigma_r)$, i.e., for each $\sigma$ there is a permutation $\tau = \rho_H(\sigma)$ such that

$$(H\sigma_1\sigma, \dots, H\sigma_r\sigma) = (H\sigma_{\tau(1)}, \dots, H\sigma_{\tau(r)}).$$

It is easy to check that in this case $\rho_H : \mathfrak{S}_n \to \mathfrak{S}_r$ must be a group homomorphism.
Notice that if $H$ is a normal subgroup of $\mathfrak{S}_n$, then the action of $\mathfrak{S}_n$ over the coset
space is represented by the quotient group $\mathfrak{S}_n / H$, and the kernel of $\rho_H$ is $H$.

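Example 2.2. The coset space of $H = \mathfrak{D}_4$ has size $4!/8 = 3$, and a possible
transversal is $\sigma_1 = e$, $\sigma_2 = (12)$, $\sigma_3 = (14)$. We have $\rho_H(\mathfrak{S}_4) = \mathfrak{S}_3$: indeed, the
permutation $\sigma = (12) \in \mathfrak{S}_4$ is such that $(H\sigma_1\sigma, H\sigma_2\sigma, H\sigma_3\sigma) = (H\sigma_2, H\sigma_1, H\sigma_3)$,
and therefore $\rho_H(\sigma) = (12)$, while the permutation $\sigma = (14) \in \mathfrak{S}_4$ is such that
$(H\sigma_1\sigma, H\sigma_2\sigma, H\sigma_3\sigma) = (H\sigma_3, H\sigma_2, H\sigma_1)$, therefore $\rho_H(\sigma) = (13)$. Thus $\rho_H(\mathfrak{S}_4)$
must be a subgroup of $\mathfrak{S}_3$ containing $(12)$ and $(13)$, that is, $\mathfrak{S}_3$ itself.

With the same technique, noting that $\sigma_i^{-1}\sigma_j$ maps the coset $H\sigma_i$ to $H\sigma_j$, we can
prove that the action $\rho_H$ of $\mathfrak{S}_n$ over the coset space is transitive.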
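The coset action of Example 2.2 can be enumerated by brute force. The sketch below uses only the Python standard library; permutations are 0-based tuples, and the composition order in `compose` (left factor applied first) is a convention chosen here for illustration and is not fixed by the text. All function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Permutation obtained by applying p first and then q (0-based tuples)."""
    return tuple(q[p[i]] for i in range(len(p)))

def generate(gens):
    """Group generated by gens: closure of the identity under right multiplication."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

S4 = list(permutations(range(4)))
H = generate([(1, 2, 3, 0), (0, 3, 2, 1)])  # rotation (1234) and mirror (24)

# Right cosets H*s, listed in a fixed order.
cosets = sorted({frozenset(compose(h, s) for h in H) for s in S4}, key=sorted)

def rho(g):
    """Permutation of the coset indices induced by right multiplication by g."""
    return tuple(cosets.index(frozenset(compose(x, g) for x in c)) for c in cosets)

image = {rho(g) for g in S4}
kernel = [g for g in S4 if rho(g) == (0, 1, 2)]

# e, (12), (14) written as 0-based tuples; they should hit three distinct cosets.
transversal = [(0, 1, 2, 3), (1, 0, 2, 3), (3, 1, 2, 0)]
hit = {next(i for i, c in enumerate(cosets) if s in c) for s in transversal}

print(len(H), len(cosets), len(image), len(kernel), len(hit))
```

The image having six elements confirms that the action realizes all of $\mathfrak{S}_3$; the kernel is a normal subgroup of order 4, which is what makes the case $n = 4$ special later on.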

Group action and composition of quasi-means. We may define a right action of
$\mathfrak{S}_n$ on the set of quasi-means of $n$ matrices as

$$(Q\sigma)(A_1, \dots, A_n) := Q(A_{\sigma(1)}, \dots, A_{\sigma(n)}).$$

The choice of putting $\sigma$ to the right, albeit slightly unusual, was made to simplify
some of the notations used later on.

When $Q$ is a quasi-mean of $r$ matrices and $R_1, R_2, \dots, R_r$ are quasi-means of $n$
matrices, let us define $Q \circ (R_1, R_2, \dots, R_r)$ as the map

$$\bigl(Q \circ (R_1, R_2, \dots, R_r)\bigr)(\mathcal{A}) := Q(R_1(\mathcal{A}), R_2(\mathcal{A}), \dots, R_r(\mathcal{A})). \tag{2.1}$$

Theorem 2.3. Let $Q(A_1, \dots, A_r)$ and $R_j(A_1, \dots, A_n)$ (for $j = 1, \dots, r$) be
quasi-means. Then,

1. For all $\sigma \in \mathfrak{S}_r$, $Q\sigma$ is a quasi-mean.

2. $(A_1, \dots, A_r, A_{r+1}) \mapsto Q(A_1, \dots, A_r)$ is a quasi-mean.

3. $Q \circ (R_1, R_2, \dots, R_r)$ is a quasi-mean.

Proof. All properties follow directly from the monotonicity (P4) and from the
corresponding properties for the means $Q$ and $R_j$. □

We may then define the isotropy group, or stabilizer group, of a quasi-mean $Q$

$$\operatorname{Stab}(Q) := \{\sigma \in \mathfrak{S}_n : Q = Q\sigma\}. \tag{2.2}$$



3. Means obtained as map compositions. 



Reductive symmetries. Let us define the concept of reductive symmetries of a
quasi-mean as follows.

• in the special case in which $G_2(A, B) = A \# B$, the symmetry property that
$A \# B = B \# A$ is a reductive symmetry.

• let $Q \circ (R_1, \dots, R_r)$ be a quasi-mean obtained by composition. The symmetry
with respect to the permutation $\sigma$ (i.e., the fact that $Q = Q\sigma$) is a reductive
symmetry for $Q \circ (R_1, \dots, R_r)$ if this property can be formally proved relying
only on the reductive symmetries of $Q$ and $R_1, \dots, R_r$.

For instance, if we take $Q(A, B, C) := A \# (B \# C)$, then we can deduce that
$Q(A, B, C) = Q(A, C, B)$ for all $A, B, C$, but not that $Q(A, B, C) = Q(B, C, A)$ for all
$A, B, C$. This does not imply that such a symmetry property does not hold: if we were
considering the operator $+$ instead of $\#$, then it would hold that $A + B + C = B + C + A$,
but there are no means of proving it relying only on the commutativity of addition;
in fact, associativity is crucial.

As we stated in the introduction, Ando, Li and Mathias [1] showed explicit
counterexamples proving that all the symmetry properties of $G^{4rec}$ and $G^{rec}$ are reductive
symmetries. We conjecture the following.

Conjecture 1. All the symmetries of a quasi-mean obtained by recursive
composition from $G_2$ are reductive symmetries.

In other words, we postulate that no "unexpected symmetries" appear while
examining compositions of quasi-means. This is a rather strong statement; however,
the numerical experiments and the theoretical analysis performed up to now never
showed any invariance property that could not be inferred from those of the underlying
means.

We shall prove several results limiting the reductive symmetries that a mean can
have; to this aim, we introduce the reductive isotropy group

$$\operatorname{RStab}(Q) = \{\sigma \in \operatorname{Stab}(Q) : Q = Q\sigma \text{ is a reductive symmetry}\}. \tag{3.1}$$

We will prove that, for $n > 4$, there is no quasi-mean $Q$ obtained by composition of
simpler means such that $\operatorname{RStab}(Q) = \mathfrak{S}_n$. This shows
that the existing "tools" in the mathematician's "toolbox" do not allow one to
construct a matrix geometric mean (with full proof) based only on map compositions;
thus we need either to devise a completely new construction or to find a novel way to
prove additional invariance properties involving map compositions.

Reduction to a special form. The following results show that when looking for a
reductive matrix geometric mean, i.e., a quasi-mean $Q$ with $\operatorname{RStab} Q = \mathfrak{S}_n$, we may
restrict our search to quasi-means of a special form.

Theorem 3.1. Let $Q$ be a quasi-mean of $r + s$ matrices, and $R_1, R_2, \dots, R_r$,
$S_1, S_2, \dots, S_s$ be quasi-means of $n$ matrices such that $R_i \neq S_j\sigma$ for all $i, j$ and every
$\sigma \in \mathfrak{S}_n$. Then,

$$\operatorname{RStab}\bigl(Q \circ (R_1, R_2, \dots, R_r, S_1, S_2, \dots, S_s)\bigr) \subseteq \operatorname{RStab}\bigl(Q \circ (R_1, R_2, \dots, R_r, R_1, R_1, \dots, R_1)\bigr). \tag{3.2}$$

Proof. Let $\sigma \in \operatorname{RStab}(Q \circ (R_1, R_2, \dots, R_r, S_1, S_2, \dots, S_s))$; since the only invariance
properties that we may assume on $Q$ are those predicted by its invariance group, it
must be the case that

$$(R_1\sigma, R_2\sigma, \dots, R_r\sigma, S_1\sigma, S_2\sigma, \dots, S_s\sigma)$$

is a permutation of $(R_1, R_2, \dots, R_r, S_1, S_2, \dots, S_s)$ belonging to $\operatorname{RStab}(Q)$. Since
$R_i \neq S_j\sigma$, this permutation must map the sets $\{R_1, R_2, \dots, R_r\}$ and $\{S_1, S_2, \dots, S_s\}$ to
themselves. Therefore, the same permutation maps

$$(R_1, R_2, \dots, R_r, R_1, R_1, \dots, R_1)$$

to

$$(R_1\sigma, R_2\sigma, \dots, R_r\sigma, R_1\sigma, R_1\sigma, \dots, R_1\sigma).$$

This implies that

$$Q(R_1, R_2, \dots, R_r, R_1, R_1, \dots, R_1) = Q(R_1\sigma, R_2\sigma, \dots, R_r\sigma, R_1\sigma, R_1\sigma, \dots, R_1\sigma)$$

as requested. □

Theorem 3.2. Let $M_1 := Q \circ (R_1, R_2, \dots, R_r)$ be a quasi-mean. Then there is a
quasi-mean $M_2$ in the form

$$\tilde{Q} \circ (R\sigma_1, R\sigma_2, \dots, R\sigma_{\tilde{r}}), \tag{3.3}$$

where $(\sigma_1, \sigma_2, \dots, \sigma_{\tilde{r}})$ is a right coset transversal for $\operatorname{RStab}(R)$ in $\mathfrak{S}_n$, such that
$\operatorname{RStab}(M_1) \subseteq \operatorname{RStab}(M_2)$.

Proof. Set $R = R_1$. For each $i = 2, 3, \dots, r$, if $R_i \neq R\sigma$ for every $\sigma$, we may replace it with $R$,
and by Theorem 3.1 the reductive isotropy group increases or stays the same. Thus
by repeated application of this theorem, we may reduce to the case in which each $R_i$
is in the form $R\tau_i$ for some permutation $\tau_i$.

Since $\{\sigma_i\}$ is a right transversal, we may write $\tau_i = h_i \sigma_{k(i)}$ for some $h_i \in H$ and
$k(i) \in \{1, 2, \dots, \tilde{r}\}$. We have $Rh_i = R$ since $h_i \in \operatorname{Stab} R$, thus $R_i = R\sigma_{k(i)}$. The
resulting quasi-mean is $Q \circ (R\sigma_{k(1)}, \dots, R\sigma_{k(r)})$. Notice that we may have $k(i) = k(j)$,
or some cosets may be missing. Let now $\tilde{Q}$ be defined as $\tilde{Q}(A_1, A_2, \dots, A_{\tilde{r}}) :=
Q(A_{k(1)}, \dots, A_{k(r)})$; then we have

$$\tilde{Q}(R\sigma_1, \dots, R\sigma_{\tilde{r}}) = Q(R\sigma_{k(1)}, \dots, R\sigma_{k(r)}) \tag{3.4}$$

and thus the isotropy groups of the left-hand side and right-hand side coincide. □

For the sake of brevity, we shall define

$$Q \circ R := Q \circ (R\sigma_1, \dots, R\sigma_r),$$

assuming a standard choice of the transversal for $H = \operatorname{Stab} R$. Notice that $Q \circ R$
depends on the ordering of the cosets $H\sigma_1, \dots, H\sigma_r$, but not on the choice of the
coset representative $\sigma_i$, since $Rh\sigma_i = R\sigma_i$ for each $h \in H$.

Example 3.3. The quasi-mean $(A, B, C) \mapsto (A \# B) \# (B \# C)$ is $Q \circ Q$, where
$Q(X, Y, Z) = X \# Y$, $H = \{e, (12)\}$, and the transversal is $\{e, (13), (23)\}$.

Example 3.4. The quasi-mean $(A, B, C) \mapsto (A \# B) \# C$ is not in the form
(3.3), but in view of Theorem 3.1 its reductive isotropy group is a subgroup of that
of $(A, B, C) \mapsto (A \# B) \# (A \# B)$.

The following theorem shows which permutations we can actually prove to belong
to the reductive isotropy group of a mean in the form (3.3).

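Theorem 3.5. Let $H \leq \mathfrak{S}_n$, $R$ be a quasi-mean of $n$ matrices such that
$\operatorname{RStab} R = H$ and $Q$ be a quasi-mean of $r = n!/|H|$ matrices. Let $G \leq \mathfrak{S}_n$ be the
largest permutation subgroup such that $\rho_H(G) \leq \operatorname{RStab}(Q)$. Then, $G = \operatorname{RStab}(Q \circ R)$.

Proof. Let $\sigma \in G$ and $\tau = \rho_H(\sigma)$; we have

$$\begin{aligned}
(Q \circ R)\sigma(\mathcal{A}) &= Q(R\sigma_1\sigma(\mathcal{A}), R\sigma_2\sigma(\mathcal{A}), \dots, R\sigma_r\sigma(\mathcal{A})) \\
&= Q(R\sigma_{\tau(1)}(\mathcal{A}), R\sigma_{\tau(2)}(\mathcal{A}), \dots, R\sigma_{\tau(r)}(\mathcal{A})) \\
&= Q(R\sigma_1(\mathcal{A}), R\sigma_2(\mathcal{A}), \dots, R\sigma_r(\mathcal{A})),
\end{aligned}$$

where the last equality holds because $\tau \in \operatorname{Stab}(Q)$.

Notice that the above construction is the only way to obtain invariance with
respect to a given permutation $\sigma$: indeed, to prove invariance relying only on the
invariance properties of $Q$, $(R\sigma_1\sigma, \dots, R\sigma_r\sigma) = (R\sigma_{\tau(1)}, \dots, R\sigma_{\tau(r)})$ must be a
permutation of $(R\sigma_1, \dots, R\sigma_r)$ belonging to $\operatorname{RStab} Q$, and thus $\rho_H(\sigma) = \tau \in \operatorname{Stab} Q$.
Thus the reductive invariance group of the composite mean is precisely the largest
subgroup $G$ such that $\rho_H(G) \leq \operatorname{Stab} Q$. □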

Example 3.6. Let $n = 4$, $Q$ be any (reductive) geometric mean of three matrices
(i.e., $\operatorname{RStab} Q = \mathfrak{S}_3$), and $R(A, B, C, D) := (A \# C) \# (B \# D)$. We have $H =
\operatorname{RStab} R = \mathfrak{D}_4$, the dihedral group over four elements, with cardinality 8. There are
$r = 4!/|H| = 3$ cosets. Since $\rho_H(\mathfrak{S}_4)$ is a subset of $\operatorname{Stab} Q = \mathfrak{S}_3$, the isotropy group
of $Q \circ R$ contains $G = \mathfrak{S}_4$ by Theorem 3.5. Therefore $Q \circ R$ is a geometric mean of
four matrices.

Indeed, the only assertion we have to check explicitly is that $\operatorname{RStab} R = \mathfrak{D}_4$. The
isotropy group of $R$ contains $(24)$ and $(1234)$, since by using repeatedly the fact that
$\#$ is symmetric in its arguments we can prove that $R(A, B, C, D) = R(A, D, C, B)$
and $R(A, B, C, D) = R(D, A, B, C)$. Thus it must contain the subgroup generated by
these two elements, that is, $\mathfrak{D}_4 \leq \operatorname{RStab} R$. The only subgroups of $\mathfrak{S}_4$ containing $\mathfrak{D}_4$
as a subgroup are the two trivial ones $\mathfrak{S}_4$ and $\mathfrak{D}_4$. We cannot have $\operatorname{RStab} R = \mathfrak{S}_4$,
since $R$ has the same definition as $G^{4rec}$ of equation (1.8), apart from a reordering,
and it was proved [1] that this is not a geometric mean.
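The formal part of this argument can be checked mechanically. In the sketch below, which uses only the Python standard library, $A \# B$ is modelled as an unordered pair, so the only law available is the commutativity of $\#$; this is a simplified stand-in for the notion of reductive symmetry, and the function names are hypothetical. It counts the permutations that formally fix $R$ and verifies that the unordered triple of pairings fed to a symmetric $Q$ is fixed by all of $\mathfrak{S}_4$.

```python
from itertools import permutations

def sharp(x, y):
    """Formal A # B: an unordered pair, so only commutativity of # is assumed."""
    return frozenset((x, y))

def R(a, b, c, d):
    """Formal version of R(A, B, C, D) = (A # C) # (B # D)."""
    return sharp(sharp(a, c), sharp(b, d))

def new_mean_args(a, b, c, d):
    """Unordered triple of arguments passed to a symmetric mean Q of three matrices."""
    return frozenset((R(a, c, b, d), R(a, b, c, d), R(a, b, d, c)))

perms = list(permutations(range(4)))

# Permutations of the four inputs that leave R formally unchanged.
stab_R = [p for p in perms if R(*p) == R(0, 1, 2, 3)]
print(len(stab_R))

# R(A, D, C, B) and R(D, A, B, C), i.e. the generators (24) and (1234).
print((0, 3, 2, 1) in stab_R, (3, 0, 1, 2) in stab_R)

# With a fully symmetric Q, every permutation fixes the composed expression.
fully_symmetric = all(new_mean_args(*p) == new_mean_args(0, 1, 2, 3) for p in perms)
print(fully_symmetric)
```

The count of eight matches $|\mathfrak{D}_4|$, and the last check is the formal counterpart of the statement that $Q \circ R$ is invariant under all 24 permutations.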

It is important to notice that by choosing as $Q$ the mean $G_3 = G^{ALM}_3$ or $G_3 = G^{BMP}_3$ in the
previous example we may obtain a geometric mean of four matrices using a single
limit process, the one needed for $G_3$. This is more efficient than $G^{ALM}_4$ and $G^{BMP}_4$,
which compute a mean of four matrices via several means of three matrices, each of
which requires a limit process in its computation. We will return to this topic in
Section 5.



Above four elements. Is it possible to obtain a reductive geometric mean of $n$
matrices, for $n > 4$, starting from simpler means and using the construction of
Theorem 3.5? The following result shows that the answer is no.

Theorem 3.7. Suppose $G := \operatorname{RStab}(Q \circ R) \geq \mathfrak{A}_n$ and $n > 4$. Then $\mathfrak{A}_n \leq
\operatorname{RStab}(Q)$ or $\mathfrak{A}_n \leq \operatorname{RStab}(R)$.

Proof. Let us consider $K = \ker \rho_H$. It is a normal subgroup of $\mathfrak{S}_n$, but for $n > 4$
the only normal subgroups of $\mathfrak{S}_n$ are the trivial group $\{e\}$, $\mathfrak{A}_n$ and $\mathfrak{S}_n$ [5]. Let us
consider the three cases separately.

1. $K = \{e\}$. In this case, $\rho_H(G) \cong G$, and thus $G \leq \operatorname{RStab} Q$.

2. $K = \mathfrak{S}_n$. In this case, $\rho_H(\mathfrak{S}_n)$ is the trivial group. But the action of $\mathfrak{S}_n$ over
the coset space is transitive, since $\sigma_i^{-1}\sigma_j$ sends the coset $H\sigma_i$ to the coset
$H\sigma_j$. So the only possibility is that there is a single coset in the coset space,
i.e., $H = \mathfrak{S}_n$.

3. $K = \mathfrak{A}_n$. As in the above case, since the action is transitive, it must be the
case that there are at most two cosets in the coset space, and thus $H = \mathfrak{S}_n$
or $H = \mathfrak{A}_n$.

□

Thus it is impossible to apply Theorem 3.5 to obtain a quasi-mean with reductive
isotropy group containing $\mathfrak{A}_n$, unless one of the two starting quasi-means has a
reductive isotropy group already containing $\mathfrak{A}_n$.



4. Means obtained as limits. 



An algebraic setting for limit means. We shall now describe a unifying algebraic
setting in terms of isotropy groups, generalizing the procedures leading to the means
defined by limit processes $G^{ALM}$, $G^{BMP}$ and $G^P$.

Let $S : (P_m)^n \to (P_m)^n$ be a map; we shall say that $S$ preserves a subgroup
$H \leq \mathfrak{S}_n$ if there is a map $\tau : H \to H$ such that $Sh(\mathcal{A}) = \tau(h)S(\mathcal{A})$ for all $\mathcal{A} \in (P_m)^n$.

Theorem 4.1. Let $S : (P_m)^n \to (P_m)^n$ be a map and $H \leq \mathfrak{S}_n$ be a permutation
group such that

1. $\mathcal{A} \mapsto (S(\mathcal{A}))_i$ is a quasi-mean for all $i = 1, \dots, n$,

2. $S$ preserves $H$,

3. for all $\mathcal{A} \in (P_m)^n$, $\lim_{k \to \infty} S^k(\mathcal{A})$ is a scalar $n$-tuple,¹

and let us denote by $S^{\infty}(\mathcal{A})$ the common value of all entries of the scalar $n$-tuple
$\lim_{k \to \infty} S^k(\mathcal{A})$. Then, $S^{\infty}(\mathcal{A})$ is a quasi-mean with isotropy group containing $H$.

Proof. From Theorem 2.3 it follows that $\mathcal{A} \mapsto (S^k(\mathcal{A}))_i$ is a quasi-mean for each
$k$. Since all the properties defining a quasi-mean pass to the limit, $S^{\infty}$ is a quasi-mean
itself.

Let us take $h \in H$ and $\mathcal{A} \in (P_m)^n$. It is easy to prove by induction on $k$ that
$S^k h(\mathcal{A}) = \tau^k(h)\bigl(S^k(\mathcal{A})\bigr)$. Now, choose a matrix norm inducing the Euclidean topology on $P_m$;
let $\varepsilon > 0$ be fixed, and let us take $K$ such that for all $k > K$ and for all $i = 1, \dots, n$
the following inequalities hold:

• $\|(S^k(\mathcal{A}))_i - S^{\infty}(\mathcal{A})\| < \varepsilon$,

• $\|(S^k h(\mathcal{A}))_i - S^{\infty}h(\mathcal{A})\| < \varepsilon$.

We know that $(S^k h(\mathcal{A}))_i = \bigl(\tau^k(h)S^k(\mathcal{A})\bigr)_i = (S^k(\mathcal{A}))_{\tau^k(h)(i)}$, therefore

$$\|S^{\infty}(\mathcal{A}) - S^{\infty}h(\mathcal{A})\| \leq \bigl\|(S^k(\mathcal{A}))_{\tau^k(h)(i)} - S^{\infty}(\mathcal{A})\bigr\| + \bigl\|(S^k h(\mathcal{A}))_i - S^{\infty}h(\mathcal{A})\bigr\| < 2\varepsilon.$$

Since $\varepsilon$ is arbitrary, the two limits must coincide. This holds for each $h \in H$, therefore
$H \leq \operatorname{Stab} S^{\infty}$. □

¹Here $S^k$ denotes function iteration: $S^1 = S$ and $S^{k+1}(\mathcal{A}) = S(S^k(\mathcal{A}))$ for all $k$.

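Example 4.2. The map $S$ defining $G^{ALM}_4$ is

$$\begin{bmatrix} A \\ B \\ C \\ D \end{bmatrix} \mapsto
\begin{bmatrix} G^{ALM}_3(B, C, D) \\ G^{ALM}_3(A, C, D) \\ G^{ALM}_3(A, B, D) \\ G^{ALM}_3(A, B, C) \end{bmatrix}.$$

One can see that $S\sigma = \sigma^{-1}S$ for each $\sigma \in \mathfrak{S}_4$, and thus with the choice $\tau(\sigma) := \sigma^{-1}$
we get that $S$ preserves $\mathfrak{S}_4$. Thus, by Theorem 4.1, $S^{\infty} = G^{ALM}_4$ is a geometric mean
of four matrices. The same reasoning applies to $G^{BMP}$.

Example 4.3. The map $S$ defining $G^P_4$ is

$$\begin{bmatrix} A \\ B \\ C \\ D \end{bmatrix} \mapsto
\begin{bmatrix} A \# B \\ B \# C \\ C \# D \\ D \# A \end{bmatrix}.$$

$S$ preserves the dihedral group $\mathfrak{D}_4$. Therefore, provided the iteration process converges
to a scalar $n$-tuple, $S^{\infty}$ is a quasi-mean with isotropy group containing $\mathfrak{D}_4$.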



Efficiency of the limit process. As in the previous section, we are interested in
seeing whether this approach, which is the one that has been used to prove invariance
properties of the known limit means [1, 4], can yield better results for a different map
$S$.

Theorem 4.4. Let $S : (P_m)^n \to (P_m)^n$ preserve a group $H$. Then, the invariance
group of each of its components $S_i$, $i = 1, \dots, n$, is a subgroup of $H$ of index at most
$n$.

Proof. Let $i$ be fixed, and set $I_k := \{h \in H : \tau(h)(i) = k\}$. The sets $I_k$ are
mutually disjoint and their union is $H$, so the largest one has cardinality at least
$|H|/n$; let us call it $I_{\bar{k}}$.

From the hypothesis that $S$ preserves $H$, we get $S_i h(\mathcal{A}) = S_{\bar{k}}(\mathcal{A})$ for each $\mathcal{A}$ and
each $h \in I_{\bar{k}}$. Let $\bar{h}$ be an element of $I_{\bar{k}}$; then $S_i h(\bar{h}^{-1}\mathcal{A}) = S_{\bar{k}}(\bar{h}^{-1}\mathcal{A}) = S_i(\mathcal{A})$. Thus
the isotropy group of $S_i$ contains all the elements of the form $h\bar{h}^{-1}$, $h \in I_{\bar{k}}$, and those
are at least $|H|/n$. □

The following result holds [5, page 147].

Theorem 4.5. For $n > 4$, the only subgroups of $\mathfrak{S}_n$ with index at most $n$ are:

• the alternating group $\mathfrak{A}_n$,

• the $n$ groups $T_k = \{\sigma \in \mathfrak{S}_n : \sigma(k) = k\}$, $k = 1, \dots, n$, all of which are
isomorphic to $\mathfrak{S}_{n-1}$,

• for $n = 6$ only, another conjugacy class of 6 subgroups of index 6 isomorphic
to $\mathfrak{S}_5$.

Analogously, the only subgroups of $\mathfrak{A}_n$ with index at most $n$ are:

• the $n$ groups $U_k = \{\sigma \in \mathfrak{A}_n : \sigma(k) = k\}$, $k = 1, \dots, n$, all of which are
isomorphic to $\mathfrak{A}_{n-1}$,

• for $n = 6$ only, another conjugacy class of 6 subgroups of index 6 isomorphic
to $\mathfrak{A}_5$.

This shows that whenever we try to construct a geometric mean of $n$ matrices
by taking a limit process, such as in the Ando-Li-Mathias approach, the isotropy
groups of the starting means must contain $\mathfrak{A}_{n-1}$. On the other hand, by Theorem 3.5
we cannot generate means whose isotropy group contains $\mathfrak{A}_{n-1}$ by composition of
simpler means; therefore, there is no simpler approach than that of building a mean
of $n$ matrices as a limit process of means of $n - 1$ matrices (or at least quasi-means with
$\operatorname{Stab} Q = \mathfrak{A}_{n-1}$, which makes little difference). This shows that the recursive approach
of $G^{ALM}$ and $G^{BMP}$ cannot be simplified while still maintaining P3 (permutation
invariance).

5. Computational issues and numerical experiments. 



A faster mean of four matrices. The results we have exposed up to now are
negative results, and they hold for $n > 4$. On the other hand, it turns out that for
$n = 4$, since $\mathfrak{A}_n$ is not a simple group, there is the possibility of obtaining a mean that
is computationally simpler than the ones in use. Such a mean is the one we described
in Example 3.6. Let us take any mean of three elements (we shall use $G^{BMP}_3$ here
since it is the one with the best computational results); the new mean is therefore
defined as

$$G^{NEW}_4(A, B, C, D) := G^{BMP}_3\bigl((A \# B) \# (C \# D),\ (A \# C) \# (B \# D),\ (A \# D) \# (B \# C)\bigr). \tag{5.1}$$

Notice that only one limit process is needed to compute the mean; conversely, when
computing $G^{ALM}_4$ or $G^{BMP}_4$ we are performing an iteration whose elements are
computed by doing four additional limit processes; thus we may expect a large saving in
the overall computational cost.
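Definition (5.1) can be prototyped in a few lines. The sketch below is not the MATLAB code used for the benchmarks; it is a hedged Python illustration for real symmetric positive definite inputs, in which the names `bmp3` and `new_mean4`, the tolerance and the test matrices (all of determinant 1) are assumptions. The three-matrix mean is the iteration (1.6) with $n = 3$. The script measures the largest deviation of the result over all 24 orderings of the inputs.

```python
from itertools import permutations
import numpy as np

def spd_power(A, t):
    """A**t for a real symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.T

def geodesic(A, B, t=0.5):
    """A #_t B, computed as A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = spd_power(A, 0.5), spd_power(A, -0.5)
    M = Aih @ B @ Aih
    G = Ah @ spd_power((M + M.T) / 2, t) @ Ah
    return (G + G.T) / 2

def bmp3(A, B, C, tol=1e-11, max_iter=100):
    """Iteration (1.6) for n = 3, with (1.3) as the mean of two matrices."""
    X = [A, B, C]
    for _ in range(max_iter):
        X = [geodesic(geodesic(X[(i + 1) % 3], X[(i + 2) % 3]), X[i], 1.0 / 3.0)
             for i in range(3)]
        if max(np.abs(X[i] - X[0]).max() for i in (1, 2)) < tol:
            return X[0]
    raise RuntimeError("BMP iteration did not converge")

def new_mean4(A, B, C, D):
    """Formula (5.1): a mean of three matrices applied to the three pairings."""
    def pairing(X, Y, Z, W):
        return geodesic(geodesic(X, Y), geodesic(Z, W))
    return bmp3(pairing(A, B, C, D), pairing(A, C, B, D), pairing(A, D, B, C))

mats = [np.eye(2),
        np.diag([4.0, 0.25]),
        np.array([[2.0, 1.0], [1.0, 1.0]]),
        np.array([[1.0, 1.0], [1.0, 2.0]])]

G = new_mean4(*mats)
deviation = max(np.abs(new_mean4(*[mats[i] for i in p]) - G).max()
                for p in permutations(range(4)))
print("largest deviation over the 24 orderings:", deviation)
print("determinant of the mean:", np.linalg.det(G))
```

Any reordering of the inputs only permutes the three pairings, so the deviation stays at the level of the stopping tolerance; the single call to `bmp3` is the only limit process involved.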

We may extend the definition recursively to $n > 4$ elements using the
construction described in (1.6), but with $G^{NEW}$ instead of $G^{BMP}$. The total
computational cost, computed in the same fashion as for the ALM and BMP means, is
$O(n!\, p_3 p_5 p_6 \cdots p_n m^3)$. Thus the undesirable dependence on $n!$ does not disappear;
the new mean should only yield a saving measured by a multiplicative constant in the
complexity bound.

Data set (number of matrices)          BMP mean    New mean
NaClO3 (5)                             1.3E+00     3.1E-01
Ammonium dihydrogen phosphate (4)      3.5E-01     3.9E-02
Potassium dihydrogen phosphate (4)     3.5E-01     3.9E-02
Quartz (6)                             2.9E+01     6.7E+00
Rochelle salt (4)                      6.0E-01     5.5E-02

Table 5.1
CPU times for the elasticity data sets

                                BMP mean                                 New mean
Outer iterations (n = 4)        3                                        none
Inner iterations (n = 3)        4 x 2.0 (avg.) per outer iteration       2
Matrix square roots (sqrtm)     72                                       15
Matrix p-th roots (rootm)       84                                       6

Table 5.2
Number of inner and outer iterations needed, and number of matrix roots needed (ammonium
dihydrogen phosphate)

Benchmarks. We have implemented the original BMP algorithm and the new one
described in the above section in MATLAB® and run some tests on the same set
of examples used by Moakher [8] and Bini et al. [4]. These examples derive from
physical experiments on elasticity. They consist of five sets of matrices to
average, with $n$ varying from 4 to 6; the matrices are $6 \times 6$ and split
into smaller diagonal blocks.

For each of the five data sets, we have computed both the BMP and the new
matrix mean. The CPU times are reported in Table 5.1. As a stopping criterion
for the iterations, we used

$$\max\, (\,\cdots\,)_{jk} < 10^{-13}.$$

As we expected, our mean provides a substantial reduction of the CPU time: by a
factor of about 4 for the sets with $n = 5$ and $n = 6$, and by roughly an order
of magnitude (a factor of 9 to 11) for the sets with $n = 4$.

Following Bini et al. [3], we then focused on the second data set (ammonium
dihydrogen phosphate) for a deeper analysis; we report in Table 5.2 the number
of iterations and matrix roots needed in both computations.

The examples in these data sets are mainly composed of matrices very close to
each other; we shall consider here instead an example of a mean of four
matrices whose mutual distances are larger:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 100 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}, \quad
D = \begin{bmatrix} 20 & 0 & -10 \\ 0 & 20 & 0 \\ -10 & 0 & 20 \end{bmatrix}. \qquad (5.2)$$

The results regarding these matrices are reported in Table 5.3.

                               BMP mean                              New mean

Outer iterations (n = 4)       4                                     none
Inner iterations (n = 3)       4 x 2.5 (avg.) per outer iteration    3
Matrix square roots (sqrtm)    120                                   18
Matrix p-th roots (rootm)      136                                   9

Table 5.3
Number of inner and outer iterations needed, and number of matrix roots needed

Operation                                                                  Result

$\|G^{BMP}(M^{-2}, M, M^2, M^3) - M\|_2$                                   4.0E-14
$\|G^{NEW}(M^{-2}, M, M^2, M^3) - M\|_2$                                   2.5E-14
$|\det(G^{BMP}(A,B,C,D)) - (\det(A)\det(B)\det(C)\det(D))^{1/4}|$          5.5E-13
$|\det(G^{NEW}(A,B,C,D)) - (\det(A)\det(B)\det(C)\det(D))^{1/4}|$          2.1E-13

Table 5.4
Accuracy tests
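The root counts in Tables 5.2 and 5.3 can be reproduced from the iteration counts by a short calculation. The sketch below is one possible reading of the tables, not a description of the actual implementation: it assumes that every unweighted mean $X \# Y$ costs one `sqrtm`, that every weighted mean costs one `rootm`, that each step of the three-matrix iteration uses three of each, and that each outer step of the four-matrix BMP iteration adds four weighted means. The function names are hypothetical.

```python
def roots_new_mean(inner_iters):
    """(sqrtm, rootm) counts for the new mean of four matrices."""
    setup_sqrtm = 6 + 3                  # six pairwise means, then three means of pairs
    sqrtm = setup_sqrtm + 3 * inner_iters
    rootm = 3 * inner_iters
    return sqrtm, rootm

def roots_bmp_mean(outer_iters, avg_inner_iters):
    """(sqrtm, rootm) counts for the BMP mean of four matrices."""
    # each outer step computes four three-matrix means, each by its own iteration
    inner_steps = round(4 * avg_inner_iters * outer_iters)
    sqrtm = 3 * inner_steps
    rootm = 3 * inner_steps + 4 * outer_iters
    return sqrtm, rootm

print(roots_bmp_mean(3, 2.0), roots_new_mean(2))   # iteration counts of Table 5.2
print(roots_bmp_mean(4, 2.5), roots_new_mean(3))   # iteration counts of Table 5.3
```

Under these assumptions the counts agree with both tables: (72, 84) against (15, 6) for the ammonium dihydrogen phosphate set, and (120, 136) against (18, 9) for the matrices in (5.2).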



Accuracy. It is not clear how to check the accuracy of a limit process yielding
a matrix geometric mean, since the exact value of the mean is not known a
priori, apart from the cases in which all the $A_i$ commute. In those cases, P1
yields a compact expression for the result. So we cannot test accuracy in the
general case; instead, we have focused on two special examples.

As a first accuracy experiment, we computed $G(M^{-2}, M, M^2, M^3) - M$, where
$M$ is taken as the first matrix of the second data set on elasticity; the
result of this computation should be zero according to P1. As a second
experiment, we tested the validity of P9 (determinant identity) on the means of
the four matrices in (5.2). The results of both computations are reported in
Table 5.4; the results are well within the errors permitted by the stopping
criterion, and show that both algorithms can reach a satisfactory precision.
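To spell out why the first experiment should return zero: the four arguments are powers of the same matrix, so they commute, and P1 reduces the mean to the fourth root of their product. Assuming the exponents are $-2$, $1$, $2$ and $3$, this gives

$$G(M^{-2}, M, M^2, M^3) = \bigl(M^{-2}\, M\, M^{2}\, M^{3}\bigr)^{1/4} = \bigl(M^{4}\bigr)^{1/4} = M.$$

Any deviation of the computed mean from $M$ is therefore due to rounding and to the truncation of the limit process.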



6. Conclusions. 
Research lines. The results of this paper show that, by combining existing
matrix means, it is possible to create a new mean which is faster to compute
than the existing ones. Moreover, we show that using only function compositions
and limit processes with the existing proof strategies, it is not possible to
achieve any further significant improvement with respect to the existing
algorithms. In particular, the dependence on $n!$ cannot be removed. New
attempts should focus on other aspects, such as:

• proving new "unexpected" algebraic relations involving the existing matrix
means, which would make it possible to break out of the framework of
Theorem 3.5–Theorem 4.1;

• introducing new kinds of matrix geometric means or quasi-means, different
from the ones built using function composition and limits;

• proving that the Riemannian centroid (1.1) is a matrix mean in the sense of
Ando-Li-Mathias (currently P4 is an open problem), or providing faster and
reliable algorithms to compute it.

It is an interesting question whether it is possible to construct a quasi-mean
whose isotropy group is exactly $\mathfrak{A}_n$.